Using the Web as a Linguistic Resource to Automatically Correct Lexico-Syntactic Errors
Abstract
This paper presents an algorithm for correcting language errors typical of second-language learners. We focus on preposition errors, which are very common among second-language learners but are poorly handled by current commercial grammar checkers and editing aids. The algorithm takes as input a sentence containing a preposition error (and possibly other errors as well) and outputs the correct preposition for that sentence context. We use a two-phase hybrid approach that combines rule-based and statistical processing. In the first phase, rule-based processing generates a short expression that captures the context in which the preposition is used in the input sentence. In the second phase, Web searches are used to evaluate the frequency of this expression when alternative prepositions are substituted for the original one. We tested the algorithm on a corpus of 133 French sentences written by intermediate second-language learners and found that it could address 69.9% of those cases. In contrast, the best French grammar and spell checker currently on the market, Antidote, addressed only 3% of those cases. We also show that performance degrades gracefully when a corpus of frequent n-grams is used to evaluate frequencies.
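The second phase described above can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate preposition list, the function names, and the `{prep}` template convention are all assumptions made for illustration, and the frequency source is passed in as a plain function so that a Web search count or an n-gram lookup could stand behind it. The counts in the usage example are invented placeholders, not measured data.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical candidate set; the abstract does not list the prepositions considered.
CANDIDATE_PREPOSITIONS = ("à", "de", "en", "pour", "sur", "dans", "avec", "par")


def rank_prepositions(
    template: str,
    count_hits: Callable[[str], int],
    candidates: Iterable[str] = CANDIDATE_PREPOSITIONS,
) -> List[Tuple[str, int]]:
    """Score each candidate by the frequency of the context expression
    with that candidate substituted into the {prep} slot."""
    scored = [(prep, count_hits(template.format(prep=prep))) for prep in candidates]
    # sorted() is stable, so ties keep the order of the candidate list.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def suggest_preposition(
    template: str,
    count_hits: Callable[[str], int],
    candidates: Iterable[str] = CANDIDATE_PREPOSITIONS,
) -> Optional[str]:
    """Return the most frequent candidate, or None if no variant was ever seen."""
    ranked = rank_prepositions(template, count_hits, candidates)
    if not ranked:
        return None
    best, frequency = ranked[0]
    return best if frequency > 0 else None


# Toy frequency table standing in for Web hit counts (invented numbers).
TOY_COUNTS = {"réfléchir à": 120, "réfléchir sur": 45, "réfléchir de": 3}


def toy_count(phrase: str) -> int:
    """Look up a phrase in the toy table; unseen phrases count as zero."""
    return TOY_COUNTS.get(phrase, 0)


if __name__ == "__main__":
    print(suggest_preposition("réfléchir {prep}", toy_count))
```

Passing the frequency source as a parameter keeps the ranking logic independent of where the counts come from, which mirrors the abstract's comparison between Web searches and a corpus of frequent n-grams.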
Related resources
The role of Persian causative markers in the acquisition of English causative verbs
This project investigates the relationship between lexical semantics and causative morphology in the acquisition of causative/inchoative-related verbs in English as a foreign language by Iranian speakers. Results of translation and picture judgment tasks show that although L2 learners have largely acquired the correct lexico-syntactic classification of verbs in English, they were constrained by ...
Variation and Semantic Relation Interpretation: Linguistic and Processing Issues
Studies in linguistics define lexico-syntactic patterns to characterize the linguistic utterances that can be interpreted with semantic relations. Because patterns are assumed to reflect linguistic regularities that have a stable interpretation, several software implement such patterns to extract semantic relations from text. Nevertheless, a thorough analysis of pattern occurrences in various c...
Text Mining for Causal Relations
Given a semantic relation, the automatic extraction of linguistic patterns that express that relation is a rather difficult problem. This paper presents a semi-automatic method of discovering generally applicable lexico-syntactic patterns that refer to the causal relation. The patterns are found automatically, but their validation is done semi-automatically.
Ontology Enrichment for the Food Traceability Domain Using Romanian Lexico-syntactic Patterns
Ontologies are considered as the most important building blocks of semantic Web. Building such ontologies is a time consuming and difficult task, which requires a high degree of human intervention. In this paper we describe a method to facilitate the enrichment of Romanian language domain taxonomies by using a text-mining approach. We exploit Romanian domain specific texts in order to automatic...
Syntactic Transfer Using a Bilingual Lexicon
We consider the problem of using a bilingual dictionary to transfer lexico-syntactic information from a resource-rich source language to a resource-poor target language. In contrast to past work that used bitexts to transfer analyses of specific sentences at the token level, we instead use features to transfer the behavior of words at a type level. In a discriminative dependency parsing framewo...
Publication date: 2008